Design and Analysis of an Adjacent Multi-bit Error Correcting Code for Nanoscale SRAMs
نویسنده
چکیده
Increasing static random access memory (SRAM) bitcell density is a major driving force for semiconductor technology scaling. The industry standard 2x reduction in SRAM bitcell area per technology node has lead to a proliferation in memory intensive applications as greater memory system capacity can be realized per unit area. Coupled with this increasing capacity is an increasing SRAM system-level soft error rate (SER). Soft errors, caused by galactic radiation and radioactive chip packaging material corrupt a bitcell’s data-state and are a potential cause of catastrophic system failures. Further, reductions in device geometries, design rules, and sensitive node capacitances increase the probability of multiple adjacent bitcells being upset per particle strike to over 30% of the total SER below the 45 nm process node. Traditionally, these upsets have been addressed using a simple error correction code (ECC) combined with word interleaving. With continued scaling however, errors beyond this setup begin to emerge. Although more powerful ECCs exist, they come at an increased overhead in terms of area and latency. Additionally, interleaving adds complexity to the system and may not always be feasible for the given architecture. In this thesis, a new class of ECC targeted toward adjacent multi-bit upsets (MBU) is proposed and analyzed. These codes present a tradeoff between the currently popular single error correcting-double error detecting (SEC-DED) ECCs used in SRAMs (that are unable to correct MBUs), and the more robust multi-bit ECC schemes used for MBU reliability. The proposed codes are evaluated and compared against other ECCs using a custom test suite and multi-bit error channel model developed in Matlab as well as Verilog hardware description language (HDL) implementations synthesized using Synopsys Design Compiler and a commercial 65 nm bulk CMOS standard cell library. Simulation results show that for the same check-bit overhead as a conventional 64 data-bit SEC-DED code, the proposed scheme provides a corrected-SER approximately equal to the Bose-ChaudhuriHocquenghem (BCH) double error correcting (DEC) code, and a 4.38x improvement over the SEC-DED code in the same error channel. While, for 3 additional check-bits (still 3 less than the BCH DEC code), a triple adjacent error correcting version of the proposed code provides a 2.35x improvement in corrected-SER over the BCH DEC code for 90.9% less ECC circuit area and 17.4% less error correction delay. For further verification, a 0.4-1.0 V 75 kb single-cycle SRAM macro protected with a programmable, up-to-3-adjacent-bit-correcting version of the proposed ECC has been fabricated in a commercial 28 nm bulk CMOS process. The SRAM macro has undergone neutron irradiation testing at the TRIUMF Neutron Irradiation Facility in Vancouver, Canada. Measurements results show a 189x improvement in SER over an unprotected memory with no ECC enabled and a 5x improvement over a traditional single-error-correction (SEC) code at 0.5 V using 1-way interleaving for the same number of check-bits. This is comparable with the 4.38x improvement observed in simulation. Measurement results confirm an
منابع مشابه
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to design of fault tolerant computing systems. In this paper, a technique is employed that enable the combination of several codes, in order to obtain flexibility in the design of error correcting codes. Code combining techniques are very effective, which one of these codes are turbo codes. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...
متن کاملDouble Error Correcting Codes
The construction of optimal linear block errorcorrecting codes is not an easy problem, for this, many studies describe methods for generating good. That is, even when taking into account the multitude of errors possible for multilevel quantum systems, topological quantum error-correction codes employing. The point of error correcting codes is to control the error rate. Let's say you have a Sing...
متن کاملPerformance Analysis of Forward Error Correction Codes used for Transmission of data considering Failures in Coder/Decoder Circuits
Reliable and efficient transmission of data over a wireless network has been an increasing demand for the past few years. Whereas, major concern is the control of errors and to obtain reliable reproduction of data .This is achieved by using a variety of error control codes which improves BER performance in digital communication systems and are important in wireless information networks. Perform...
متن کاملCooperative MIMO Communication at Wireless Sensor Network: An Error Correcting Code Approach
Cooperative communication in wireless sensor network (WSN) explores the energy efficient wireless communication schemes between multiple sensors and data gathering node (DGN) by exploiting multiple input multiple output (MIMO) and multiple input single output (MISO) configurations. In this paper, an energy efficient cooperative MIMO (C-MIMO) technique is proposed where low density parity check ...
متن کاملHardware Implementation of Single Bit Error Correction and Double Bit Error Detection through Selective Bit Placement for Memory
Hamming codes are widely used for the single bit error correction double bit error detection (SEC-DED) which occurred during data transmission process. This paper presents an enhanced detection of double adjacent bit errors and correcting all possible single bit errors in Hamming codes through selective bit placement technique for memory application. Soft errors occur due to the radiation parti...
متن کامل